Automatic Classification of Music Genres Using the Deep Learning Approach

Authors: Sheetal Janthakal, Bharath P I, Bineeth Rathod H A, B Sai Venkata Chaitanya, Bhardwaj Tejaswi G

DOI Link: https://doi.org/10.22214/ijraset.2023.51683

Abstract

Music is divided into arbitrary groups known as genres. Music genre classification is a challenging task because of the subjective and ambiguous nature of musical genres. Existing systems for music genre classification suffer from low accuracy and poor generalization to new data. There is therefore a need for a robust and accurate machine-learning model that can overcome these challenges and classify music audio files into genres with high accuracy. The main aim of this project is to develop a user-friendly application that accepts an audio file as input and predicts its genre, such as disco or hip-hop, using machine learning models. The application automates the process to reduce manual error and time. The final classification is obtained from the collection of individual data. The system uses Support Vector Machine (SVM) and Logistic Regression models, both of which will be integrated into a website to make the project easily accessible.

Introduction

I. INTRODUCTION

People are finding it increasingly hard to manage the music they listen to as internet music databases continue to expand and become more widely accessible. One way to categorize and organize songs is by genre, which is identified by characteristics of the music such as rhythmic structure, harmonic content, and instrumentation. Music genre classification has been a frequently explored research subject since the early days of the Internet. Early work addressed the problem with supervised machine-learning techniques and introduced three sets of features for the task: timbral structure, rhythmic content, and pitch content. Hidden Markov Models (HMMs), which have been used extensively for speech recognition, have also been explored for music genre classification. The aim of this project is to build an application that uses machine learning (ML) algorithms to identify and classify the genre of a given audio file. The best-performing model is then used for predictions and integrated with the application to display the classification output.

II. LITERATURE REVIEW

Much research has been carried out to determine which method is best for classifying music genres. Some of this work is summarized below.

In paper [1], the author used the K-Nearest Neighbour, Random Forest, Support Vector Machine, and Artificial Neural Network algorithms and achieved good accuracy. The author observes that the success rate decreases as the number of target music genres increases. High success rates were achieved only for the Reggae and Hip-hop genres. The author attributes this to the data used in the study and concludes that a better dataset would increase success rates.

Article [2] presents an automatic music genre classification system based on a Convolutional Neural Network. The feature vectors were calculated using the Mel spectrum and MFCCs. The Python-based librosa package was used to extract the features, which provided good parameters for network training. The author concludes that the methodology is promising for classifying a huge database of songs into their respective genres. As future work, the author recommends extending the system to classify songs by mood, which could help identify the kind of music that reduces stress in a listener. According to the author, this might be useful in music therapy, where particular music could be played depending on the person's stress level.

Article [3] compares several classification models and establishes a new CNN model that performs better than previously proposed models. The models were trained and compared on the GTZAN dataset; most of them were trained on the audio files, while a few were trained on spectrograms.

The work used different inputs for the different models: the audio Mel spectrogram was passed to the CNN, and various audio-file characteristics were passed to the ANN, SVM, MLP, and Decision Tree models. The author concludes that some styles were quite distinctive, while others, such as country and rock, were confused with other styles; traditional and blues were easily identified.

Paper [4] observes that music plays a vital role in people's lives: it unites like-minded people, binds communities together, and communities can be identified by the type of songs they compose or listen to. In this project, the author built a deep learning system to automatically distinguish music genres from audio files. The audio files are classified using low-level frequency-domain and time-domain features. The author also built several classification models and trained them on the GTZAN dataset, which has 1000 audio tracks of 30 seconds each, covering 10 music genres with 100 tracks per genre.

Work [5] provides a comparative study of the genre classification performance of deep-learning and traditional machine-learning models. The author also compares the performance of machine-learning models built on three-second features with that of models built on thirty-second features. The author presents the categories of features used for automatic genre classification and applies the Information Gain Ranking algorithm to determine the features that contribute most to the correct classification of a music piece. Machine-learning models and Convolutional Neural Networks (CNNs) were then trained and tested on the ten genres of the GTZAN dataset.

Paper [6] notes that classification plays an important role in recommendation systems, organizing audio libraries, and discovering trends and preferences, and that machine learning techniques have proved useful in music analysis and in classifying music clips into genres. In this paper, machine learning algorithms including NN, RNN, CNN, SVC, and RFC are implemented. According to the author, the CNN model creates a binary mask for each source, which is used to create the target-extracted audio file. Performance is measured using accuracy scores and BSS-eval metrics along with listening tests. Bass masking was found to work best with this CNN-based masking model; the generated masks for vocals, drums, and accompaniment do work, but they do not completely extract the target components.

The authors of paper [7] consider the classification of music into genres to be one of the fascinating subjects in Music Information Retrieval (MIR). To predict the genre of audio signals, they used Support Vector Machines (SVM), Random Forests, XGB (Extreme Gradient Boosting), and Convolutional Neural Networks (CNN). The GTZAN dataset was used for model training and testing. The machine learning and deep learning models each had their own set of features. A comparison between these models showed that the CNN outperforms the machine learning models.

III. METHODOLOGY

A. Support Vector Machine Algorithm

Support Vector Machine (SVM) is a supervised machine learning algorithm that can be applied to both classification and regression problems. To understand the geometric intuition, consider a classification problem in which the points of two classes are linearly separable, so that a hyperplane can separate them. As shown in Fig. [1], SVM constructs not only the hyperplane but also two margin lines parallel to it, one on each side and at some distance from it, so that the points of the two classes lie on opposite sides of the margin. SVM can also be used for regression while keeping the main features that characterize the algorithm; the regression formulation is the same as the classification one with only minor changes.
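The geometric picture above can be illustrated with a minimal sketch. It does not train an SVM; it only evaluates a linear decision function and the distance between the two margin lines for a hand-picked hyperplane. The weight vector `w`, the offset `b`, and the sample points are hypothetical values chosen for illustration and do not come from the paper.

```python
import math

def decision_value(w, b, x):
    """Signed score w.x + b; its sign tells which side of the hyperplane x is on."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def predict(w, b, x):
    """Assign class +1 or -1 from the sign of the decision value."""
    return 1 if decision_value(w, b, x) >= 0 else -1

def margin_width(w):
    """Distance between the margin lines w.x + b = +1 and w.x + b = -1."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))

# Hypothetical hyperplane x1 + x2 - 3 = 0 (hand-picked, not learned from data).
w = [1.0, 1.0]
b = -3.0

# Hypothetical 2-D points with their true class labels.
points = [([0.5, 1.0], -1), ([1.0, 1.0], -1), ([2.0, 2.0], 1), ([3.0, 2.5], 1)]
predictions = [predict(w, b, x) for x, _ in points]
```

In this toy setup the points `[1.0, 1.0]` and `[2.0, 2.0]` lie exactly on the two margin lines, so they play the role of support vectors. A trained SVM would choose `w` and `b` to make `margin_width(w)` as large as possible while keeping the classes separated.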

B. Logistic Regression Algorithm

Logistic Regression is one of the simplest and most commonly used machine learning algorithms for two-class classification. It predicts the probability of a binary event using the logistic (sigmoid) function. Like linear regression, it computes a weighted sum of the input features, but it then maps that sum to the probability that a given data entry belongs to the category labeled 1. As shown in Fig. [2], the sigmoid function produces a continuous output between 0 and 1. The dependent variable has only two possible classes and is coded as either 1 or 0.
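As a small sketch of the mechanism described above, the following code applies the sigmoid function to a weighted sum of features and thresholds the resulting probability. The weights, bias, feature values, and the 0.5 threshold are hypothetical and chosen only for illustration; the paper does not report fitted parameters.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, bias, features):
    """Probability that the example belongs to the class coded as 1."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)

def predict_label(weights, bias, features, threshold=0.5):
    """Turn the probability into a 0/1 label using a cut-off (assumed 0.5)."""
    return 1 if predict_proba(weights, bias, features) >= threshold else 0

# Hypothetical parameters for a model with two input features.
weights = [2.0, -1.0]
bias = 0.0

p_boundary = predict_proba(weights, bias, [1.0, 2.0])   # weighted sum is 0
label_pos = predict_label(weights, bias, [2.0, 1.0])    # weighted sum is 3
label_neg = predict_label(weights, bias, [0.0, 3.0])    # weighted sum is -3
```

Because this formulation is binary, classifying several genres needs some multi-class extension on top of it; the text does not say which one the authors used.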

IV. RESULTS AND DISCUSSIONS

Data collection: Collect a dataset of music audio files labeled with their respective genres. The dataset should be diverse and representative of the music genres of interest.

Feature extraction: Extract relevant features from the audio signals, such as MFCCs, spectral features, and rhythm features. The features should capture the characteristics of the music that distinguish one genre from another.

Dataset splitting: Split the dataset into training, validation, and testing sets. The training set is used to train the machine learning model, the validation set is used to tune the hyperparameters of the model, and the testing set is used to evaluate the performance of the model on unseen data.

Model selection: Select an appropriate machine learning algorithm for music genre classification, such as SVM or Logistic Regression.

Model training: Train the machine learning model on the training set using the extracted features and the corresponding labels.

Model evaluation: Evaluate the performance of the model on the validation set and adjust the hyperparameters if necessary.

Testing and analysis: Test the final model on the testing set and analyze performance metrics such as accuracy, precision, and recall. The results should be reported and compared with existing systems or the literature.

Deployment: Deploy the trained model to classify new music audio files into genres in real time.
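To make the feature-extraction step listed above more concrete, here is a minimal sketch that computes two very simple per-clip features, zero-crossing rate and RMS energy, on synthetic sine waves. These are stand-ins chosen because they need no audio library; they are not the MFCC, spectral, or rhythm features named above. The function names, the 8000 Hz sample rate, and the test tones are all assumptions made for illustration.

```python
import math

def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(samples) - 1)

def rms_energy(samples):
    """Root-mean-square amplitude, a rough measure of loudness."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def sine_wave(freq_hz, sample_rate=8000, duration_s=1.0):
    """Synthetic test tone standing in for a real audio clip."""
    n = int(sample_rate * duration_s)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]

# Two hypothetical "clips": a low tone and a high tone of equal amplitude.
low = sine_wave(100.0)
high = sine_wave(1000.0)

# Each clip is reduced to a small feature vector a classifier could consume.
features_low = (zero_crossing_rate(low), rms_energy(low))
features_high = (zero_crossing_rate(high), rms_energy(high))
```

The two tones have nearly the same RMS energy but clearly different zero-crossing rates, which shows how a frequency-related feature can separate signals that amplitude alone cannot.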

The intended outcome is a machine learning model that classifies music audio files into genres and provides a reliable and efficient solution for use in a wide range of applications, including music recommendation systems, music search engines, and music streaming platforms.
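The dataset-splitting and evaluation steps described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the 70/15/15 split ratios, the fixed random seed, the function names, and the toy genre labels are all hypothetical, and the split is a plain random shuffle rather than a per-genre stratified one.

```python
import random

def split_dataset(items, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle the items and cut them into train, validation, and test sets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]   # whatever remains is the test set
    return train, val, test

def accuracy(y_true, y_pred):
    """Share of examples whose predicted genre matches the true genre."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def precision_recall(y_true, y_pred, genre):
    """Precision and recall for one genre, treating it as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == genre and p == genre for t, p in pairs)
    fp = sum(t != genre and p == genre for t, p in pairs)
    fn = sum(t == genre and p != genre for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical track identifiers standing in for labeled audio files.
train, val, test = split_dataset(range(100))

# Hypothetical true and predicted genres for five test tracks.
y_true = ["disco", "disco", "hiphop", "hiphop", "rock"]
y_pred = ["disco", "hiphop", "hiphop", "hiphop", "rock"]
acc = accuracy(y_true, y_pred)
prec, rec = precision_recall(y_true, y_pred, "hiphop")
```

In the toy labels, one disco track is mistaken for hip-hop, so hip-hop has perfect recall but reduced precision. Reporting per-genre precision and recall alongside overall accuracy makes this kind of confusion visible.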

Conclusion

The models built, SVM and Logistic Regression, can predict the genre of songs automatically. The classification is limited to a small set of genres. Across all models, using frequency-based Mel-spectrograms produced higher accuracy, whereas amplitude provides information only on intensity, or how "loud" a sound is.

References

[1] Karatana and O. Yildiz, "Music genre classification with machine learning techniques," 25th Signal Processing and Communications Applications Conference (SIU), pp. 1-4, 2017.

[2] S. Vishnupriya and K. Meenakshi, "Automatic Music Genre Classification using Convolution Neural Network," International Conference on Computer Communication and Informatics (ICCCI), pp. 1-4, 2018.

[3] A. Ghildiyal, K. Singh and S. Sharma, "Music Genre Classification using Machine Learning," 4th International Conference on Electronics, Communication and Aerospace Technology (ICECA), pp. 1368-1372, 2020.

[4] J. K. Bhatia, R. D. Singh, and S. Kumar, "Music Genre Classification," 5th International Conference on Information Systems and Computer Networks (ISCON), pp. 1-4, 2021.

[5] N. Ndou, R. Ajoodha, and A. Jadhav, "Music Genre Classification: A Review of Deep-Learning and Traditional Machine-Learning Approaches," IEEE International IOT, Electronics and Mechatronics Conference (IEMTRONICS), pp. 1-6, 2021.

[6] P. Devaki, A. Sivanandan, R. S. Kumar and M. Z. Peer, "Music Genre Classification and Isolation," International Conference on Advancements in Electrical, Electronics, Communication, Computing and Automation (ICAECA), pp. 1-6, 2021.

[7] M. Shah, N. Pujara, K. Mangaroliya, L. Gohil, T. Vyas and S. Degadwala, "Music Genre Classification using Deep Learning," 6th International Conference on Computing Methodologies and Communication (ICCMC), pp. 974-978, 2022.

Copyright

Copyright © 2023 Sheetal Janthakal, Bharath P I, Bineeth Rathod H A, B Sai Venkata Chaitanya, Bhardwaj Tejaswi G. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Paper Id : IJRASET51683

Publish Date : 2023-05-06

ISSN : 2321-9653

Publisher Name : IJRASET